Automatic Classification of Unexploded Ordnances Based on EMI Data
Grayson Zulauf
Thayer School of Engineering, Dartmouth College, COSC 074
Project Proposal
i. Machine Learning Problem
Unexploded ordnance litter is an enormous environmental problem, afflicting nations worldwide. Ordnance dropped during warfare or military testing includes a significant number of duds: bombs that don't detonate upon impact. In the United States, a country that has not been the location of a major conflict in over a century, a potential unexploded ordnance (UXO) hazard exists over an estimated 11,000,000 acres of land.[1] The problem exists to a greater extent in countries that have recently played host to significant war zones.
The successful cleanup of these zones would allow the safe development of significant acreage across the world, yielding both environmental and economic benefits. Unfortunately, cleanup of these areas is extraordinarily expensive and the technology is still relatively primitive, relying primarily on simple metal detection, which produces an enormous number of false positives.
Professor Fridon Shubitidze, an Assistant Professor at Dartmouth's Thayer School of Engineering, has developed an innovative way to discriminate UXOs from harmless metal clutter. The method measures the time decay of the electromagnetic energy emitted by each buried target. The time decay curves (3 curves for each target of interest) differ between bombs and clutter, allowing for differentiation and classification.
Currently, classification is performed manually, with a human combing three times through thousands of objects, sorting clutter from unexploded ordnance. While Professor Shubitidze's group has used this method to successfully identify all UXOs across the DoD's test sites, a robust supervised learning algorithm would significantly expedite the classification. The algorithm will receive a minimal number of 'ground truths' (objects known to be UXOs) as a training data set, and must generate zero false negatives, although some false positives are acceptable.
ii. Suitable Methods
A number of machine learning methods appear to be well suited to this classification, but I will focus on two for the project: spectral clustering, which is unsupervised, and AdaBoost, which is supervised. As a baseline, I will implement a simple least-squares error classification method, where objects within a certain (user-specified, not machine-learned) error of the training data are classified as UXOs, and the rest as clutter.
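The baseline rule can be sketched in a few lines. This is only an illustration under stated assumptions: each object's decay curves are taken to be concatenated into one equally sampled vector, the error is a plain sum of squared differences, and the function names, toy curves, and threshold value are hypothetical rather than taken from the project's data.

```python
from typing import List, Sequence


def squared_error(curve: Sequence[float], reference: Sequence[float]) -> float:
    """Sum of squared differences between two equally sampled decay curves."""
    if len(curve) != len(reference):
        raise ValueError("curves must have the same number of time samples")
    return sum((c - r) ** 2 for c, r in zip(curve, reference))


def classify_by_threshold(
    curves: Sequence[Sequence[float]],
    ground_truths: Sequence[Sequence[float]],
    max_error: float,
) -> List[bool]:
    """Flag a curve as UXO if it lies within max_error of any ground truth."""
    labels = []
    for curve in curves:
        best = min(squared_error(curve, ref) for ref in ground_truths)
        labels.append(best <= max_error)
    return labels


# Toy decay curves (hypothetical values, not measured data).
ground_truths = [[1.0, 0.5, 0.25, 0.125]]
candidates = [
    [1.1, 0.55, 0.25, 0.12],  # close to the reference curve
    [0.2, 0.19, 0.18, 0.17],  # slow, flat decay: unlike the reference
]
labels = classify_by_threshold(candidates, ground_truths, max_error=0.05)
```

Because `max_error` is chosen by the user, loosening it trades additional false positives for a lower risk of false negatives.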
AdaBoost[2] is particularly well suited to this problem, because each boosting round increases the weight on the examples that earlier rounds misclassified. In prior research on this project, simple algorithms proved able to correctly classify the vast majority of UXOs, but were typically unable to classify a few outliers, often ones with similar curve shapes but shifted energy magnitude. These false negatives are unacceptable in this application, and AdaBoost's reweighting of misclassified targets appears to be a potentially robust solution. As a meta-algorithm, AdaBoost is designed to correct the mistakes of other learning algorithms, and this project will use it to correct the baseline least-squares error classifier discussed above.
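The reweighting idea can be illustrated with a small discrete AdaBoost sketch. It assumes labels of +1 (UXO) and -1 (clutter) and uses threshold "stumps" as the weak learner; applied to a least-squares error feature, a stump is the same kind of rule as the baseline, but that pairing is an assumption made here for illustration. The toy data and all function names are hypothetical.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Weak learner: +polarity above the threshold, -polarity at or below it."""
    return polarity if row[feature] > threshold else -polarity


def train_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        # One threshold below all values, then midpoints between neighbours.
        thresholds = [values[0] - 1.0]
        thresholds += [(a + b) / 2 for a, b in zip(values, values[1:])]
        for threshold in thresholds:
            for polarity in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds):
    """Return a list of (alpha, feature, threshold, polarity) weak learners."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, feature, threshold, polarity = train_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, feature, threshold, polarity))
        # Misclassified examples gain weight; correct ones lose weight.
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy one-feature data that no single threshold can separate.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, row) for row in X]
```

On this toy set no single stump classifies every point, but three boosted stumps do, because each round concentrates weight on the points the previous round got wrong.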
The second potential ML method is spectral clustering,[3] in the form of an algorithm published in 2001 that classifies via grouping. This algorithm is easily implemented, and characterizes how well a sample matches the other samples in its 'group.' This method would allow me to label the clusters around the 'ground truths' as unexploded ordnance, and further to differentiate between types of bombs by cluster.
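A rough two-cluster sketch of the procedure in [3] follows: build a Gaussian affinity matrix, normalize it as D^-1/2 A D^-1/2, take its top two eigenvectors, normalize the rows, and run k-means on them. To stay dependency-free, this version finds the second eigenvector by power iteration instead of a library eigen-solver and handles only k = 2; a real implementation would more likely use a numerical library. The points, `sigma`, the function names, and the step that labels the ground truth's cluster as UXO are all illustrative assumptions.

```python
import math


def gaussian_affinity(points, sigma):
    """A[i][j] = exp(-||p_i - p_j||^2 / (2 sigma^2)), with a zero diagonal."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            A[i][j] = A[j][i] = math.exp(-d2 / (2 * sigma ** 2))
    return A


def top_two_eigenvectors(A, iterations=200):
    """Top two eigenvectors of L = D^-1/2 A D^-1/2 (assumes every degree > 0)."""
    n = len(A)
    degree = [sum(row) for row in A]
    scale = [1.0 / math.sqrt(d) for d in degree]
    L = [[scale[i] * A[i][j] * scale[j] for j in range(n)] for i in range(n)]
    # The leading eigenvector (eigenvalue 1) is proportional to sqrt(degree).
    norm = math.sqrt(sum(degree))
    v1 = [math.sqrt(d) / norm for d in degree]
    # Power iteration on (L + I) / 2, projecting out v1, finds the second one.
    v2 = [(-1.0) ** i * (i + 1) for i in range(n)]  # arbitrary deterministic start
    for _ in range(iterations):
        overlap = sum(a * b for a, b in zip(v2, v1))
        v2 = [a - overlap * b for a, b in zip(v2, v1)]
        v2 = [0.5 * (sum(L[i][j] * v2[j] for j in range(n)) + v2[i]) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v2))
        v2 = [a / length for a in v2]
    return v1, v2


def two_means(rows, iterations=20):
    """Plain 2-means with a deterministic farthest-point initialisation."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    def mean(group, fallback):
        return [sum(c) / len(group) for c in zip(*group)] if group else fallback

    c0 = rows[0]
    c1 = max(rows, key=lambda r: dist2(r, c0))
    labels = [0] * len(rows)
    for _ in range(iterations):
        labels = [0 if dist2(r, c0) <= dist2(r, c1) else 1 for r in rows]
        c0 = mean([r for r, l in zip(rows, labels) if l == 0], c0)
        c1 = mean([r for r, l in zip(rows, labels) if l == 1], c1)
    return labels


def spectral_two_clusters(points, sigma=1.0):
    """Cluster points into two groups via row-normalized spectral embedding."""
    v1, v2 = top_two_eigenvectors(gaussian_affinity(points, sigma))
    rows = []
    for a, b in zip(v1, v2):
        length = math.hypot(a, b)
        rows.append([a / length, b / length])
    return two_means(rows)


# Toy feature vectors (hypothetical): two well-separated groups of three.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]]
labels = spectral_two_clusters(points)
ground_truth_index = 0  # index of an object known to be a UXO
flagged = [label == labels[ground_truth_index] for label in labels]
```

Cluster numbering is arbitrary, so the last line flags whichever cluster contains the known UXO rather than relying on a fixed label.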
Because any algorithm must generate zero false negatives, one with implementation potential would almost certainly require robustness in the form of multiple checks on the results. This project aims to aid Prof. Shubitidze in his research, and the results of these algorithms will be validated against one another with this strict requirement in mind. I intend to include a discussion of the merits of using these methods in conjunction in my final paper.
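One possible way to use several methods in conjunction, sketched here purely as an assumption, is to flag an object whenever any classifier flags it. This can only reduce false negatives relative to each individual method, at the cost of more false positives. The flag lists and indices below are made-up values for illustration.

```python
def combine_conservatively(*flag_lists):
    """Flag an object as UXO if any classifier flags it (logical OR)."""
    return [any(flags) for flags in zip(*flag_lists)]


def false_negatives(flags, known_uxo_indices):
    """Indices of known UXOs that the given flags failed to mark."""
    return [i for i in known_uxo_indices if not flags[i]]


# Hypothetical outputs of two classifiers over the same four objects.
baseline_flags = [True, False, False, True]
boosted_flags = [True, True, False, False]
combined = combine_conservatively(baseline_flags, boosted_flags)

known_uxo_indices = [0, 1, 3]  # hypothetical scored ground truth
missed_by_baseline = false_negatives(baseline_flags, known_uxo_indices)
missed_by_combined = false_negatives(combined, known_uxo_indices)
```

In this toy case each classifier alone misses one known UXO, while the combined flags miss none.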
iii. Identified Data Sets
Professor Shubitidze has provided me with multiple .mat data sets from which to work, all from the U.S. Department of Defense's training sites. For each site, he has provided between 1 and 3 ground truths, as well as the necessary curves for the entire data set. After I test an algorithm, he 'scores' the items I have returned to him as identified unexploded ordnance. While these sets contain large numbers of targets of interest (around 2,000 each), more are available upon request. As such, I have already obtained the necessary training data and application data, as well as the ability to score my algorithms.
iv. Milestone Expectations
By the milestone presentation on February 19th, I hope to have completed the code for all of the project's algorithms and obtained initial results from all of the above methods. Once these algorithms are complete, I plan to acquire additional data from Prof. Shubitidze to test them further. The milestone should feature nearly all of the results that my final presentation will, with the exception of the discussion of using these algorithms in conjunction with one another, and perhaps the data from one or two additional test sites.
[1] http://engineering.dartmouth.edu/emsg/
[2] Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 119-139. ISSN 0022-0000. doi:10.1006/jcss.1997.1504.
[3] Ng, A., Jordan, M., & Weiss, Y. (2001). On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems, 14.